Hierarchical Feature Selection with Recursive Regularization

نویسندگان

  • Hong Zhao
  • Pengfei Zhu
  • Ping Wang
  • Qinghua Hu
چکیده

In the big data era, the sizes of datasets have increased dramatically in terms of the number of samples, features, and classes. In particular, there exists usually a hierarchical structure among the classes. This kind of task is called hierarchical classification. Various algorithms have been developed to select informative features for flat classification. However, these algorithms ignore the semantic hyponymy in the directory of hierarchical classes, and select a uniform subset of the features for all classes. In this paper, we propose a new technique for hierarchical feature selection based on recursive regularization. This algorithm takes the hierarchical information of the class structure into account. As opposed to flat feature selection, we select different feature subsets for each node in a hierarchical tree structure using the parent-children relationships and the sibling relationships for hierarchical regularization. By imposing `2,1-norm regularization to different parts of the hierarchical classes, we can learn a sparse matrix for the feature ranking of each node. Extensive experiments on public datasets demonstrate the effectiveness of the proposed algorithm.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Mental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals

Introduction: Mental arithmetic analysis based on Electroencephalogram (EEG) signal for monitoring the state of the user’s brain functioning can be helpful for understanding some psychological disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia where the difficulty in learning or understanding the arithmetic exists. Most mental arithmetic recogni...

متن کامل

Technical Report Yr-2008-002 Multi-class Feature Selection with Support Vector Machines

We consider feature selection in a multi-class setting where the goal is to find a small set of features for all the classes simultaneously. We develop an embedded method for this problem using the idea of scaling factors. Training involves the solution of a convex program for which we give a scalable algorithm. The method is closely related to extensions of L1 regularization and recursive feat...

متن کامل

Multi-Class Feature Selection with Support Vector Machines

We consider feature selection in a multi-class setting where the goal is to find a small set of features for all the classes simultaneously. We develop an embedded method for this problem using the idea of scaling factors. Training involves the solution of a convex program for which we give a scalable algorithm. The method is closely related to extensions of L1 regularization and recursive feat...

متن کامل

Bridging the semantic gap for software effort estimation by hierarchical feature selection techniques

Software project management is one of the significant activates in the software development process. Software Development Effort Estimation (SDEE) is a challenging task in the software project management. SDEE is an old activity in computer industry from 1940s and has been reviewed several times. A SDEE model is appropriate if it provides the accuracy and confidence simultaneously before softwa...

متن کامل

Simulation-based Regularized Logistic Regression

In this paper, we develop a simulation-based framework for regularized logistic regression, exploiting two novel results for scale mixtures of normals. By carefully choosing a hierarchical model for the likelihood by one type of mixture, and implementing regularization with another, we obtain new MCMC schemes with varying efficiency depending on the data type (binary v. binomial, say) and the d...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017